The Japanese Lexical Transducer Based on Stem-Suffix Style Forms
Abstract
This paper describes an optimal method for constructing a lexical transducer for Japanese: stems and suffixes are described separately in different lexicons, and an extra level of transducers is added to map between canonical citation forms and stem-suffix style forms. Compared with other methods, this reduces both the complexity of the rule descriptions and the computational load of intersection. We built a full-size lexical transducer for Japanese. It has about 60 thousand states and about 300 thousand arcs. Its physical size ranges from 800 KB to 1.5 MB, depending on the compaction method.
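The two-level idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: plain dictionaries and regular expressions stand in for real finite-state transducers, and the romanized lexicon entries, the `+Past` tag, the `^` boundary symbol, and all function names are hypothetical choices made for this illustration only.

```python
import re

# Hypothetical toy lexicons (romanized); not taken from the paper.
# Stems and suffixes are kept in separate tables, as the abstract describes.
STEMS = {"kaku": "kak", "yomu": "yom", "taberu": "tabe"}
SUFFIXES = {"+Past": "ta"}

# Ordered rewrite rules applied at the stem^suffix boundary.
RULES = [
    (re.compile(r"k\^ta"), "ita"),  # kak^ta  -> kaita
    (re.compile(r"m\^ta"), "nda"),  # yom^ta  -> yonda
    (re.compile(r"\^"), ""),        # default: just drop the boundary
]


def to_stem_suffix(citation: str, tag: str) -> str:
    """Extra level: citation form plus tag -> stem-suffix style form."""
    return f"{STEMS[citation]}^{SUFFIXES[tag]}"


def to_surface(form: str) -> str:
    """Rule level: stem-suffix style form -> surface form."""
    for pattern, replacement in RULES:
        form = pattern.sub(replacement, form)
    return form


def generate(citation: str, tag: str) -> str:
    """Compose the two levels, mimicking transducer composition."""
    return to_surface(to_stem_suffix(citation, tag))


if __name__ == "__main__":
    for verb in ("kaku", "yomu", "taberu"):
        print(verb, "->", to_stem_suffix(verb, "+Past"), "->", generate(verb, "+Past"))
```

Because the rules only ever see stem-suffix strings, they never have to undo the citation-form ending (the final "u" or "ru" here), which is one way to read the abstract's claim about simpler rule descriptions.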
Related articles
Arabic Finite-State Morphological Analysis and Generation
Figure 5: Root drs with CVCVC Template and Active Voweling (abstract lexical level: drs+FormI+Perfect+Active; abstract intermediate level: drs+CVCVC+aa; abstract intersected level: daras). Figure 6: Root drs with CVCVC Template and Passive Voweling (abstract lexical level: drs+FormI+Perfect+Passive; abstract intermediate level: drs+CVCVC+ui; abstract intersected level: duris). defined as single symbols, and if +Fo...
Computing with Realizational Morphology
The theory of realizational morphology presented by Stump in his influential book Inflectional Morphology (2001) describes the derivation of inflected surface forms from underlying lexical forms by means of ordered blocks of realization rules. The theory presents a rich formalism for expressing generalizations about phenomena commonly found in the morphological systems of natural languages. Thi...
Designing spelling correctors for inflected languages using lexical transducers
This paper describes the components used in the design of the commercial Xuxen II spelling checker/corrector for Basque. It is a new version of the Xuxen spelling corrector (Aduriz et al., 97) which uses lexical transducers to improve the process. A very important new feature is the use of user dictionaries whose entries can recognise both the original and inflected forms. In languages wit...
Japanese Dependency Analysis using a Deterministic Finite State Transducer
A deterministic finite state transducer is a fast device for analyzing strings. It takes O(n) time to analyze a string of length n. In this paper, an application of this technique to Japanese dependency analysis will be described. We achieved the speed at a small cost in accuracy. It takes about 0.17 milliseconds to analyze one sentence (average length is 10 bunsetsu, based on Pentium III 650 MHz ...